Using the Matrix Program

This section provides information for using the Matrix program.

Introduction to the Matrix program

The Matrix program processes zonal data and matrices according to specified expressions. Zonal data and matrices can be input, and matrices and reports can be output. Various file formats for both input and output are supported. There are no default processes; you must specify what is to be accomplished. This program is also used when invoked as a special function via RUN PGM= FRATAR, GENERATION, or DISTRIBUTION. It is used for the following purposes:

Computation of new matrix values
Trip distribution (called by Distribution program)
Trip generation (called by Generation program)
Converting and merging matrices between various formats
Reporting values from matrices and zonal data:
- Selected rows
- Marginal summaries (trip ends, etc.)
- Frequency distributions
- User formatted files
Transposing matrices
Generating matrices
Renumbering, aggregating, and disaggregating matrices

Almost all of the above processes can be performed in a single application. Some restrictions apply when the program is invoked as a special function (Fratar, Distribution, and Generation programs).

The program processes within an overall origin zone loop controlled by the variable named I. The remainder of this document refers to this loop as the I-loop. The I-loop begins with I set to one and continues monotonically until the highest zone number is processed. When the Distribution program calls the Matrix program, I-loops are nested within iteration loops. (See Distribution Program for details.)

The standard input comprises definition, computational, reporting, and flow control statements (illustrated below). Definition statements define the input and output files and are, in most cases, processed outside of the I-loop. Computational statements cause data to be revised. Flow control statements provide the capability to iterate through portions of the script and to branch, conditionally or unconditionally, to a different place within the I-loop.

When all control statements have been checked for basic syntax and stored in the control stack, the program builds a list of required variables. The input files are opened, zonal data files are read and stored, and other housekeeping is performed. If any input matrices need to be transposed, they are transformed to a single file and made ready for subsequent retrieval. The program then starts the I-loop, reads the matrices for I, and processes the control stack from the beginning. When the end of the stack is reached, the program performs some end-of-zone summaries and writes any matrices requested. When the I-loop completes (or is terminated), any requested reports are printed, and the program exits.

All variables are initialized to zero (or blank) before the I-loop, and are thereafter altered only by computational statements or internal processing. In addition to any variables explicitly specified by the user, there are certain built-in variables that can be referenced. The built-in variables are usually protected; the user is not allowed to alter their values.

Subsequent topics discuss:

Built-in variables

Matrix program built-in variables

FIRSTZONE - Stores the zone number of the first zone being processed. In a normal run this is 1. Under intrastep distributed processing with CUBE Cluster, this is the first zone number of the zones to be processed on the current CUBE Cluster node.
I - The current row zone.
ITERATION - The iteration number; usually 1.
J - A column index; usually 1, but varies (1 to ZONES) for MW[] and LOOPs.
LASTZONE - Stores the last zone number to be processed. In a normal run this would be the same as ZONES. Under intrastep distributed processing with CUBE Cluster, this is the highest zone number to be processed on the current CUBE Cluster node.
LOWCNT - Result of LOWEST().
MW[] - A work matrix; see COMP for more details.
RECI.NUMFIELDS - Holds the number of fields on the input record file.
RECI.NUMRECORDS - Holds the number of records in the input record file.
RECI.RECERR - Holds the current error status of the input record file:
- 0 indicates that there is no data error in the records read.
- 1 indicates that there is a data error in the records read.
RECI.RECNO - Holds the record number of the current record being processed from the input record file.
THISPROCESS - Contains the process number of the current process when using intrastep distributed processing under CUBE Cluster.
Z - A copy of I.
ZONES - Number of zones.

Built-in functions

Described in more detail in COMP.

Matrix program built-in functions

ARRAYSUM() - Returns the value of the sum of an array’s values.
LOWEST() - Sum lowest valued cells in a row.
MATVAL() - Allow random access to values from MATIs.
ROWADD() - Add matrices.
ROWAVE() - Average cell value where value!=0.
ROWCNT() - Number of cells whose values != 0.
ROWDIV() - Divide one matrix by another.
ROWFAC() - Factors the row by fac.
ROWFIX() - Integerize mw (start at column intrazonal + 1).
ROWMAX() - Maximum value in the row.
ROWMIN() - Minimum value in the row.
ROWMPY() - Multiply one matrix by another.
ROWSUM() - Row total.
CHECKNAME(NAME) - Check if a matrix table/ zonal file field exists. If it does not exist, the function returns a zero. If it exists, the return code is a 2 digit number in the format of {t}{s}, defined as follows:

{t} is type

0=single variable

1=vector

2=user function

3=work matrices

4=user function

5=user function

6=vector with constant index and no default index

7=multidimensional array, no default index

{s} is storage size

1=numeric

2=string

e.g. 1=numeric variable, 2=string variable, 11=numeric array, 12=string array
GETVALUE(NAME {, DEFAULTVALUE}) - Get a numeric value from a matrix table/zonal file field. This function is used together with CHECKNAME: CHECKNAME checks that a matrix table/zonal file field exists, and GETVALUE extracts its value. Together they prevent a model run from crashing when a matrix table/zonal file field in the input files is missing or holds an invalid value. A default value can be specified for use when the value is invalid.

Note: MI.x.x must be used within an explicit JLOOP block, where the default indexes I and J will be set to the correct origin and destination by the Matrix program.

JLOOP
    IF (CHECKNAME('MI.1.HBW')) hbw = GETVALUE('MI.1.HBW')
ENDJLOOP

GETMATRIXROW(NAME, MW# {, DEFAULTVALUE}) - Load an MI matrix row into an MW. NAME must be an MI.x.x name. The return code can be the following:

0=matrix row loaded successfully

1=invalid name or other errors, the default value is used

2=invalid name or other errors and no default value is supplied, program terminating

nn=GetMatrixRow('MI.1.NHB',2,1)
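
The two-digit {t}{s} return code of CHECKNAME described above can be unpacked with integer division. The following Python sketch is only an illustration of that encoding; the function name decode_checkname is hypothetical, and the labels are copied from the lists above.

```python
# Labels copied from the CHECKNAME description; decode_checkname is a
# hypothetical helper, not part of CUBE Voyager.
TYPE_LABELS = {
    0: "single variable",
    1: "vector",
    2: "user function",
    3: "work matrices",
    4: "user function",
    5: "user function",
    6: "vector with constant index and no default index",
    7: "multidimensional array, no default index",
}
STORAGE_LABELS = {1: "numeric", 2: "string"}


def decode_checkname(code):
    """Split a CHECKNAME return code into (type, storage) labels.

    Returns None for 0, which signals that the name does not exist.
    """
    if code == 0:
        return None
    type_digit, storage_digit = divmod(code, 10)
    return TYPE_LABELS[type_digit], STORAGE_LABELS[storage_digit]


print(decode_checkname(1))   # ('single variable', 'numeric')
print(decode_checkname(12))  # ('vector', 'string')
```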

Transposed matrices

A copy of a transposed input is obtained by referencing a variable with a name of MI.n.name.T. In Matrix program terminology, a transposed matrix is one that has had its rows and columns switched. See COMP for details.

Control statement types in Matrix program

There are several types of control statements used in the Matrix program:

Definition statements - Define static processes.

ARRAY
FILEI and FILEO (which define the input and output data)
LOOKUP
PARAMETERS
RENUMBER

Computational statements - Cause variable values to be updated.

Reporting statements - Cause values to be accumulated and/or displayed dynamically.

Flow control statements - Set which statement is to be performed next.

For more details about control statements, see Control Statements.

Working with intrazonal cells of a matrix

During the processing of demand data it is often necessary to access the intrazonal element of a matrix or to set its value. Special keywords INTRAZONAL or INTRA are provided to assist in this.

To set the intrazonal element of a matrix row, amend the normal COMP command to take one of the following forms:

INTRAZONAL MW[x]=expression
COMP MW[x][INTRAZONAL]=expression
COMP MW[x][INTRA]=expression

where x indicates the appropriate working matrix number. When such commands are executed, all elements of the matrix which lie off the diagonal are left unchanged.

Note that it is invalid to use the above forms of calculation within a JLOOP (as JLOOP implies varying J, the column index, whereas INTRAZONAL or INTRA fixes it to I, the row index). Neither can they be used with the INCLUDE or EXCLUDE subkeywords of the MW[] statement, which are used to select destination zones.

Such commands may be used in conjunction with the LOWEST function to set an intrazonal cost based on the cost to the nearest zone(s). As an example:

INTRAZONAL mw[10]=0

INTRAZONAL mw[10]=LOWEST(10, 1, 0.01, 99999)

sets the intrazonal cost to the cost from the origin to the nearest destination, but ignoring destinations with costs less than 0.01 or more than 99999. The preceding setting of MW[10]’s intrazonal cell to zero ensures that any starting value in that cell does not become the result of the LOWEST function.
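
The two-step pattern above can be mimicked outside the Matrix program. The following Python sketch assumes that LOWEST sums the n smallest values in the current row that lie between the low and high limits, as the description suggests; the helper names are hypothetical and the row of costs is made up for illustration.

```python
def lowest(row, count, low, high):
    """Sum the `count` smallest values in `row` that lie within [low, high]."""
    eligible = sorted(v for v in row if low <= v <= high)
    return sum(eligible[:count])


def set_intrazonal_from_nearest(row, i):
    """Return a copy of `row` whose intrazonal cell (0-based index i)
    holds the cost to the nearest other zone."""
    updated = list(row)
    updated[i] = 0  # mirrors INTRAZONAL mw[10]=0
    # With the cell zeroed, it falls below the 0.01 limit and is ignored.
    updated[i] = lowest(updated, 1, 0.01, 99999)
    return updated


# Made-up costs from origin zone 2 (index 1) to zones 1..4
costs = [7.5, 4.0, 12.0, 3.2]
print(set_intrazonal_from_nearest(costs, 1))
```

Zeroing the cell first matters: without it, the starting value 4.0 would itself be eligible and could be returned as the "nearest" cost.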

The INTRAZONAL or INTRA keywords may be used to access the diagonal element of a matrix during calculations. This keyword is coded in a similar manner to the j (or column) position when a matrix is referenced in an arithmetic expression. Examples are:

MW[1]=MW[1][INTRA]

which copies the intrazonal (or diagonal) element of the first working matrix into all column positions, and:

var1=MW[10][INTRAZONAL]

which sets a scalar variable var1 to the intrazonal element of working matrix ten, taken from the current row of the matrix.

Working with lists of zones

When modeling travel demand, it is often necessary to apply different treatments to various types of zones, and similarly to movement types within matrices. Examples of such zone groupings are the CBD (central business district), industrial zones, the suburbs, or zones corresponding to external cordon points. Any of these areas comprises a list of zone numbers; lists which are lengthy and used on many occasions in processing could easily be a source of errors in typing or updating. To avoid such difficulties, such lists of zones may be set up once, given a suitable descriptive name to identify them, and then used wherever appropriate in the modeling.

This section outlines two methods:

Working with lists of zones using the INLIST function - Uses the INLIST function to select required zones; it is simple to define lists, but their use is restricted to arithmetic computations.
Working with list of zones using the substitution method - Defines lists in the Pilot program, then uses the substitution facility to work with them; it is less easy to define the required lists, but they may be used in a wider range of contexts.

Working with lists of zones using the INLIST function

As an example, zones 1 to 12 and 15 to 20 form the CBD of a study area. A list called CBD may be defined, using the COMP or computation control statement. The list of zones is specified as text data which is enclosed in quotes (‘). The INLIST function is then used to construct a condition which determines whether its origin and/or destination are in the specified list. The INLIST function gives a value of 1 when the particular zone (first parameter) is found in the specified list (the second parameter). The particular zone under consideration is often i (the origin zone) or j (the destination zone). The following example illustrates the use of this facility:

CBD='1-12,15-20'
suburbs='23-30,34-41,43,47,50-57'
...
IF (INLIST(i,CBD) = 1)
;   commands here will be processed for origins in the CBD
    ...
ENDIF
...
JLOOP
    IF (INLIST(j,suburbs) = 1)
;       commands here are processed for destinations in the suburbs
        ...
    ENDIF
ENDJLOOP
IF (INLIST(i,suburbs) = 1)
    JLOOP
        IF (INLIST(j,CBD) = 1)
;           commands here are processed for flows which have both
;           origins in the suburbs and destinations in the CBD
            ...
        ENDIF
    ENDJLOOP
ENDIF
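
To make the membership test concrete, the following Python sketch expands a list such as '1-12,15-20' and checks a zone against it. It is an illustration of the behavior described above, not the Matrix program's implementation; the function names are hypothetical, and returning 0 for a zone that is not in the list is an assumption.

```python
def parse_zone_list(text):
    """Expand a list such as '1-12,15-20' into a set of zone numbers."""
    zones = set()
    for item in text.split(","):
        item = item.strip()
        if not item:
            continue
        if "-" in item:
            start, end = item.split("-")
            zones.update(range(int(start), int(end) + 1))
        else:
            zones.add(int(item))
    return zones


def inlist(zone, text):
    """Return 1 if `zone` is in the list, otherwise 0 (assumed)."""
    return 1 if zone in parse_zone_list(text) else 0


CBD = "1-12,15-20"
suburbs = "23-30,34-41,43,47,50-57"
print(inlist(12, CBD), inlist(13, CBD), inlist(43, suburbs))
```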

Working with list of zones using the substitution method

As an example, zones 1 to 12 and 15 to 20 form the CBD of a study area. A list called CBD may be defined, using the COMP or computation control statement. The list of zones is specified as text data which is enclosed in quotes (‘). This is done in the script before any RUN PGM statements for programs, and forms part of the script of the Pilot program. Wherever this list of zones is required by a program, its script contains the name of the list, which is enclosed by @ symbols (denoting substitution of the list of zone numbers, in place of the list’s name). The example:

CBD='1-12,15-20'
...
RUN PGM = MATRIX
  ...
  MW[10]=mi.1.LOSmatrix
  ...
  JLOOP INCLUDE = @CBD@
    MW[10] = MW[10] + 10
  ENDJLOOP
  ...
ENDRUN

defines the list of zones in the CBD, then adds a parking cost of 10 units into the skim (or level of service) matrix for destination zones in that area.
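
The effect of the @ symbols can be pictured as plain text replacement carried out before the program reads its script. The Python sketch below assumes exactly that and nothing more; the substitute function is hypothetical, and unlike CUBE Voyager it treats names as case-sensitive.

```python
import re


def substitute(script_text, variables):
    """Replace each @name@ token with the text stored under `name`.

    Raises KeyError if a token refers to a name that was never defined.
    """
    return re.sub(r"@(\w+)@", lambda m: variables[m.group(1)], script_text)


variables = {"CBD": "1-12,15-20"}
print(substitute("JLOOP INCLUDE = @CBD@", variables))
# JLOOP INCLUDE = 1-12,15-20
```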

The defined list may contain individual zone numbers and/or ranges of zone numbers; the latter are specified in the form 1-10, with neither a space nor a comma between the start and end values. To ensure that the list is generally acceptable, it should be written with commas between items. (Although CUBE Voyager scripting allows spaces to be used as delimiters between items of a list, this form is not accepted by the conditional or IF statement, which requires commas, so the use of commas is recommended.)

A further example changes that part of a matrix which corresponds to origins in the suburbs and destinations in the CBD, dividing these matrix cells by 1.05:

...
suburbs = '100-145,148-191,194-224,227-341'
CBD = '1-12,15-20'
...
RUN PGM = MATRIX
  ...
  if(i = @suburbs@)
    jloop
      if(j = @CBD@)
        mw[1]=mw[1]/1.05
      endif
    endjloop
  endif
  ...
  ...
ENDRUN

The calculation may be expressed more concisely as:

 if(i = @suburbs@)
    jloop include = @CBD@
      mw[1]=mw[1]/1.05
    endjloop
  endif

or even:

if(i= @suburbs@) mw[1]=mw[1]/1.05 include=@CBD@

The following illustrates the case where the calculation applies to all destination zones for selected origins:

...
CordonZones = '540-578'
...
RUN PGM = MATRIX
  ...
  if(i = @CordonZones@)mw[5]=mw[5] * 1.27
  ...
ENDRUN

Working with logit choice models

This section discusses logit choice models.

Introduction to choice modeling

The XCHOICE command in CUBE Voyager scripting provides powerful extensions to the core language designed to allow complex logit choice models to be specified easily.

XCHOICE was introduced in version 4.1 of CUBE Voyager to replace the CHOICE control statement. XCHOICE implements the same logit model structures as the original CHOICE statement but improves choice-model run times. A choice model implemented with XCHOICE should run significantly faster than the same model implemented with CHOICE.

Existing choice models that use the CHOICE statement will continue to run as scripted. However, Bentley recommends modifying existing models to use the XCHOICE command statement and take advantage of the improvements in run time. Note that some of the keywords are different with the XCHOICE command statement. When converting a logit model that uses CHOICE to use XCHOICE, you must make additional changes at the keyword level. For more information please refer to XCHOICE.

Logit choice models have a number of distinct possible outcomes (for example, mode of travel), and the model estimates the probability of choosing each particular outcome. The alternatives are evaluated using either their costs or their utility values. Costs and utilities are related: as the utility of an alternative increases, the alternative becomes more attractive, whereas an increase in cost makes it less attractive. Apart from sign, the other difference is that a utility directly measures the user's benefit, whereas the cost has to be multiplied by an appropriate coefficient (or scale parameter) before use in choice models.

The simplest choice model splits total travel demand between two alternatives (or modes); it is known as the binary choice model. This may be extended by adding further alternatives, so forming a multinomial model.

In practice, when several alternatives exist, they are not always independent of each other. To overcome this, hierarchic choice models are used, which sub-divide the choice process into a sequence of decisions. Alternatives which are similar are grouped together to form sub-nests. Typical examples of sub-nests are public transport (which brings together various transit modes), or car use (which includes through-trip by car and park-and-ride). These sub-nests are then brought together in the main (or top-level) choice process. The choice process may be viewed as initially choosing between sub-nests (representing types of travel), followed by a choice at the sub-nest level which decides the mode used.

The simple choice and hierarchic models described above are forms of the "absolute" logit choice model.

An alternative form is the Incremental model (also known as the Pivot Point model) which has a different methodology. It uses data for demand (by alternative) in the base situation together with changes in costs (or utilities) between the base and forecast scenario, in order to re-allocate the demand between the alternatives in response to those cost changes.

The destination choice model is an extension to the logit choice concept where the alternatives are not modes of travel, but destination zones which the traveler chooses between. It may be combined with mode choice to form complex models, and may be used in absolute or incremental form.

The section uses a number of examples to illustrate use of the XCHOICE command and to explain the underlying theory. The examples start at the simplest level and increase in complexity.

Each model is described in the context of a typical problem, supported with the relevant theory. Then the method for implementing a solution in the Matrix program is shown, with example scripts. For some models, alternative strategies or more complex variations are also illustrated. Finally, any practical issues related to setting up such a model are discussed.

At the end of the section there are some general notes concerning coding of demand models.

For a detailed description of the demand modeling function syntax see XCHOICE.

Absolute logit model

This section discusses the absolute logit model.

Introduction

This section introduces a straightforward example of an aggregate demand model. Suppose that in a transport system there are just two discrete competing modes, car and public transport (PT), between a given set of origins and destinations. A user of such a system is said to have a binary choice, because there are just two alternatives (car and PT).

The following paragraphs explain how the absolute logit model can be applied to the problem of estimating the probability of choosing each mode, and in particular how this is implemented in CUBE Voyager.

Logit Choice model

The process begins by calculating the generalized cost of travel between each origin and destination by either mode. Usually, this cost is a linear combination of the monetary cost (fare, fuel, etc.) and time (walk, wait, interchange, in-vehicle time, etc.). There may also be an additive constant to approximate those elements of the cost that cannot be readily quantified, for example the convenience of bus services, or the quality of railway rolling stock. Let's call the cost of travel by car C_car, and by PT C_pt. Suppose that there is a total demand, D, making such a journey in a given period.

For the sake of clarity, subscripts relating to the origin and destination zone have been omitted.

The Absolute Logit Model states that the probabilities of choosing car, P_car, and PT, P_pt, are given by the equations below.

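(1) P_car = exp(-λ C_car) / (exp(-λ C_car) + exp(-λ C_pt))

(2) P_pt = exp(-λ C_pt) / (exp(-λ C_car) + exp(-λ C_pt))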

Where λ is the scale parameter, of which more later.

The forecast demand for car is given by D_car and for PT, D_pt.

(3) D_car = D × P_car

(4) D_pt = D × P_pt

The model also calculates a composite cost (C) that represents the cost of the combined choice (in this case, car and public transport), where:

(5) C = -(1/λ) ln(exp(-λ C_car) + exp(-λ C_pt))
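
As a numerical illustration of the cost-based model, the following Python sketch assumes the standard binary logit form, with weights exp(-λ × cost) and a composite cost of -(1/λ) times the logarithm of the summed weights. The function name and the input values are hypothetical and chosen only for illustration.

```python
import math


def absolute_logit(cost_car, cost_pt, demand, scale):
    """Binary absolute logit on generalized costs.

    `scale` is the positive scale parameter (lambda in the text).
    Returns (demand_car, demand_pt, composite_cost).
    """
    weight_car = math.exp(-scale * cost_car)
    weight_pt = math.exp(-scale * cost_pt)
    total_weight = weight_car + weight_pt

    p_car = weight_car / total_weight
    p_pt = weight_pt / total_weight

    # Composite cost of the combined choice: -(1/lambda) * ln(sum of weights)
    composite_cost = -math.log(total_weight) / scale
    return demand * p_car, demand * p_pt, composite_cost


# Illustrative values only: 100 trips, car cost 40, PT cost 60, lambda = 0.02
d_car, d_pt, composite = absolute_logit(40.0, 60.0, 100.0, 0.02)
print(round(d_car, 1), round(d_pt, 1), round(composite, 1))
```

Note that the composite cost is lower than the cost of either alternative, reflecting the benefit of having a choice.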

Where utilities are used instead of costs, this simple choice model does not require a distinct scale parameter, as its effects have already been incorporated into the utility values. Writing U_car and U_pt for the utilities of car and PT, the probabilities of choosing car, P_car, and PT, P_pt, are:

(6) P_car = exp(U_car) / (exp(U_car) + exp(U_pt))

(7) P_pt = exp(U_pt) / (exp(U_car) + exp(U_pt))

the composite utility (or logsum), U, is:

(8) U = ln(exp(U_car) + exp(U_pt))

and the equations for demand by alternative (D_car, etc.) are as given above.
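
The utility-based form can be sketched in the same way. The Python example below assumes the standard form in which each alternative is weighted by exp(utility) and the logsum is the logarithm of the summed weights; the function name and utility values are hypothetical.

```python
import math


def absolute_logit_utilities(util_car, util_pt, demand):
    """Binary absolute logit on utilities (no separate scale parameter).

    Returns (demand_car, demand_pt, logsum).
    """
    weight_car = math.exp(util_car)
    weight_pt = math.exp(util_pt)
    total_weight = weight_car + weight_pt

    p_car = weight_car / total_weight
    p_pt = weight_pt / total_weight
    logsum = math.log(total_weight)  # composite utility
    return demand * p_car, demand * p_pt, logsum


# Illustrative utilities only: car -0.8, PT -1.2, 100 trips
d_car, d_pt, logsum = absolute_logit_utilities(-0.8, -1.2, 100.0)
print(round(d_car, 1), round(d_pt, 1), round(logsum, 3))
```

With utilities equal to -λ × cost, this gives the same split as the cost-based form, and the logsum equals -λ times the composite cost.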

Scale parameter (cost-based models)

The behavior of the model is determined by a positive constant known as the scale parameter, called λ in the equations above. The graph below illustrates the model sensitivity with different values of λ.

If λ=0 the model is completely insensitive to cost, and demand is shared equally between each of the available choices. Notice that P_car=½ and P_pt=½ when λ=0.

As λ increases, the sensitivity of the logit model increases, progressively allocating more demand to the choice with the lower cost. The figure Logit model sensitivity below shows how the model becomes more responsive to the difference in cost for λ=0.01, 0.02 and 0.04.

Finally, as λ approaches infinity, the model will allocate all the demand to the alternative with the lowest cost.
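
The effect of the scale parameter can be checked numerically. The following Python sketch assumes the binary cost-based logit form and prints the car share for several values of λ; the costs (car 40, PT 60) and the function name are illustrative assumptions, not values from any calibrated model.

```python
import math


def car_share(cost_car, cost_pt, scale):
    """Probability of choosing car in the binary cost-based logit model.

    Written in terms of the cost difference, which is algebraically the same
    as exp(-scale*cost_car) / (exp(-scale*cost_car) + exp(-scale*cost_pt)).
    """
    return 1.0 / (1.0 + math.exp(-scale * (cost_pt - cost_car)))


# Car is cheaper than PT by 20 cost units in this made-up case
for scale in (0.0, 0.01, 0.02, 0.04, 1.0):
    print(scale, round(car_share(40.0, 60.0, scale), 3))
```

The share starts at one half when λ is zero and moves toward 1 for the cheaper alternative as λ grows.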

The value of the scale parameter will depend on the nature of the choice, characteristics of the demand and the units of cost. The examples used here are for illustrative purposes and should not be adopted as default values.

Where choice models are based on utilities, there is no cost coefficient as it has already been combined with the actual cost to form the utility values. Thus for simple choice models, the scale parameter is not required. Scale parameters are used in more complex (or hierarchic) models, but their specification and values are different from the style of use in cost-based models.

Matrix script for cost-based model

This section describes how this example can be implemented using the XCHOICE command. The fragment of script below will run this model. Variable names have been chosen to match those used in the preceding equations.

; Absolute logit model
XCHOICE,
;     List choices
  ALTERNATIVES = car, pt,
;     Input total demand
  DEMANDMW = 10,
;     Input costs
  COSTSMW = 3, 4,
;     Forecast demand
  ODEMANDMW = 15,16,
;     Model structure
  SPLIT = TOTAL 0.02 car pt,
;     Forecast composite cost
  SPLITCOMP = 19,
;     Working matrices
  STARTMW = 30

The XCHOICE command comprises a number of clauses of the form keyword = value(s) each of which defines some aspect of the logit choice model. These specify what alternatives may be chosen, the inputs to the calculations, the resulting outputs, and the structure of the logit choice model.

The block begins with the ALTERNATIVES clause, which lists the names of the alternatives. In this case the choices are car and PT. These names will be used later to define the model structure.

The model inputs are specified, starting with the total demand which is coded in the DEMANDMW clause. The specified input may represent a matrix of true demand (in trips), as shown here using MW[10], or it may be set to 1, in which case the output "demand" will be the probabilities associated with each alternative. Next, the generalized costs are specified for car, as matrix MW[3], and PT, as matrix MW[4]. These should be listed in the same order as the modes in the ALTERNATIVES clause.

Now the output variables are specified. These are forecast demand for each alternative, again specified in the same order as the ALTERNATIVES clause, so MW[15] will contain car trips and MW[16] PT trips.

Finally the structure of the choice model is defined. In this example the choice is between two modes, car and PT, but this may be extended to three or more alternatives.

The SPLIT clause defines the model’s structure in terms of the scale parameter (or coefficient of generalized costs) and the choices given in the ALTERNATIVES clause. The word TOTAL indicates that the entire input demand is to be split between the alternatives listed in this specification. The scale parameter has a value of 0.02 (or λ=0.02 in the above equations) and the choice is between the car and PT alternatives. In this script, the scale parameter is given as a numerical value but it is equally valid to use a variable instead. The forecast composite cost is output as MW[19] with the SPLITCOMP keyword. Both the forecast demand and composite cost are optional outputs, and either clause may be omitted from the script.

The calculations performed by the logit choice model require a number of working matrices (or MWs) to be allocated for the use of the XCHOICE command. The STARTMW clause specifies a working matrix number which is higher than that of any other working matrix referenced in the script. Working matrices from the STARTMW value upwards are used by the XCHOICE command, and should not be used elsewhere in the script. Where a Matrix program script contains several XCHOICE commands, the same STARTMW value may be used in all instances.

Matrix script for utility-based model

This section describes how this example can be implemented using the XCHOICE command. The fragment of script below will run this model. Variable names match those used in the preceding equations.

; Absolute logit model
XCHOICE,
;    List choices
  ALTERNATIVES = car, pt,
;    Input total demand
  DEMANDMW = 10,
;    Input utilities
  UTILITIESMW = 5, 6,
;    Forecast demand
  ODEMANDMW = 15,16,
;   Model structure
  SPLIT = TOTAL car pt,
;   Forecast composite utility
  SPLITCOMP = 18,
;   Working matrices
  STARTMW = 40

The XCHOICE command comprises a number of clauses of the form keyword = value(s), each of which defines some aspect of the logit choice model. These specify what alternatives may be chosen, the inputs to the calculations, the resulting outputs, and the structure of the logit choice model.

The block begins with the ALTERNATIVES clause, which lists the names of the alternatives. In this case the choices are car and PT. These names will be used later to define the model structure.

The model inputs are specified, starting with the total demand which is coded in the DEMANDMW clause. The specified input may represent a matrix of true demand (in trips), as shown here using MW[10], or it may be set to 1, in which case the output "demand" will be the probabilities associated with each alternative. Next, the utilities are specified for car, as matrix MW[5], and PT, as matrix MW[6]. These should be listed in the same order as the modes in the ALTERNATIVES clause.

Now the output variables are specified, as working matrix (MW) numbers. These are forecast demand for each alternative (MW[15] for car trips and MW[16] for PT, again specified in the same order as the ALTERNATIVES clause).

Finally the structure of the choice model is defined. In this example the choice is between two modes, car and PT, but this may be extended to three or more alternatives.

The SPLIT clause defines the model’s structure in terms of the choices given in the ALTERNATIVES clause; here the choice is between the car and PT alternatives. The word TOTAL indicates that this split divides the entire input demand between the specified alternatives. The forecast composite utility is output as MW[18] with the SPLITCOMP keyword. Both the forecast demand and composite utility are optional outputs, and either clause may be omitted from the script.

The calculations performed by the logit choice model require a number of working matrices (or MWs) to be allocated for the use of the XCHOICE command. The STARTMW clause specifies a working matrix number which is higher than that of any other working matrix referenced in the script. Working matrices from the STARTMW value upwards are used by the XCHOICE command, and should not be used elsewhere in the script. Where a Matrix script contains several XCHOICE commands, the same STARTMW value may be used in all instances.

Incremental logit model

This section discusses the incremental logit model.

Introduction

This example returns to the structure of the absolute choice example, but now forecasts the change in demand based on the change in cost from a known base situation. This is known as the incremental form of the logit model. The structure of the model is shown in Figure 1.

The model is developed below using both costs and utilities.

Incremental logit choice model

The model inputs are base demand by mode D_car, D_pt, base costs by mode C_car, C_pt and forecast costs by mode C’_car, C’_pt. The change in cost is denoted by DC_car and DC_pt where:

(9) DC_car = C'_car - C_car

(10) DC_pt = C'_pt - C_pt

The choice model now takes the form of the equation below where P’ denotes the forecast choice probability and λ is the scale parameter.

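(11) P'_car = D_car exp(-λ DC_car) / (D_car exp(-λ DC_car) + D_pt exp(-λ DC_pt))

(12) P'_pt = D_pt exp(-λ DC_pt) / (D_car exp(-λ DC_car) + D_pt exp(-λ DC_pt))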

So that the forecast demand by mode D'_car, D'_pt is:

(13) D'_car = (D_car + D_pt) × P'_car

(14) D'_pt = (D_car + D_pt) × P'_pt

The incremental composite cost (DC) is given by:

(15) DC = -(1/λ) ln((D_car exp(-λ DC_car) + D_pt exp(-λ DC_pt)) / (D_car + D_pt))
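
The incremental calculation can be illustrated with a short Python sketch. It assumes the usual pivot-point form, in which each mode's base demand is weighted by exp(-λ × change in cost) and the total demand is conserved; the function name and the input values are hypothetical.

```python
import math


def incremental_logit(base_car, base_pt, dcost_car, dcost_pt, scale):
    """Binary incremental (pivot-point) logit on cost changes.

    Assumes total base demand and `scale` are both positive.
    Returns (forecast_car, forecast_pt, incremental_composite_cost).
    """
    total = base_car + base_pt
    weight_car = base_car * math.exp(-scale * dcost_car)
    weight_pt = base_pt * math.exp(-scale * dcost_pt)
    denominator = weight_car + weight_pt

    forecast_car = total * weight_car / denominator
    forecast_pt = total * weight_pt / denominator
    # Change in composite cost relative to the base situation
    d_composite = -math.log(denominator / total) / scale
    return forecast_car, forecast_pt, d_composite


# Illustrative case: 60 car and 40 PT trips, PT cost rises by 10, lambda = 0.02
new_car, new_pt, d_comp = incremental_logit(60.0, 40.0, 0.0, 10.0, 0.02)
print(round(new_car, 1), round(new_pt, 1), round(d_comp, 2))
```

When both cost changes are zero, the forecast demand equals the base demand and the incremental composite cost is zero, which is a useful check on any implementation.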

When working with utilities, the above equations are adapted to reflect the absence of any scale parameter (as this is combined into the utility values). The utility differences are calculated as:

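(16) DU_car = U'_car - U_car

(17) DU_pt = U'_pt - U_pt

where U_car, U_pt are the base utilities and U'_car, U'_pt the forecast utilities by mode. The probabilities of using each alternative are:

(18) P'_car = D_car exp(DU_car) / (D_car exp(DU_car) + D_pt exp(DU_pt))

(19) P'_pt = D_pt exp(DU_pt) / (D_car exp(DU_car) + D_pt exp(DU_pt))

The equations for base and forecast demand by mode are the same as for cost-based models. The incremental composite utility (DU) is given by:

(20) DU = ln((D_car exp(DU_car) + D_pt exp(DU_pt)) / (D_car + D_pt))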
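
The utility-based incremental form can be sketched in the same way. The Python example below assumes that base demand is weighted by exp(change in utility), with no separate scale parameter; the function name and values are hypothetical.

```python
import math


def incremental_logit_utilities(base_car, base_pt, dutil_car, dutil_pt):
    """Binary incremental logit on utility changes.

    Assumes total base demand is positive.
    Returns (forecast_car, forecast_pt, incremental_composite_utility).
    """
    total = base_car + base_pt
    weight_car = base_car * math.exp(dutil_car)
    weight_pt = base_pt * math.exp(dutil_pt)
    denominator = weight_car + weight_pt

    forecast_car = total * weight_car / denominator
    forecast_pt = total * weight_pt / denominator
    d_composite = math.log(denominator / total)
    return forecast_car, forecast_pt, d_composite


# Illustrative case: PT utility falls by 0.2, car utility is unchanged
new_car, new_pt, d_util = incremental_logit_utilities(60.0, 40.0, 0.0, -0.2)
print(round(new_car, 1), round(new_pt, 1), round(d_util, 3))
```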

Matrix script for cost-based model

Comparing the example code below with the first example (same structure, but an absolute logit model), one can see that only the model inputs change to construct an incremental model. The model requires base demand, base costs and forecast costs for both modes as input. The output composite cost is the incremental change in composite cost.

; Incremental logit model
XCHOICE,
;    List choices
  ALTERNATIVES = car, pt,
;    Input base demand by mode
  BASEDEMANDMW = 10, 11,
;     Input base costs by mode
  BASECOSTSMW = 20, 21,
;    Input forecast costs by mode
  COSTSMW = 30, 31,
;    Forecast demand
  ODEMANDMW = 2,3,
;    Model Structure
  SPLIT = TOTAL 0.02 car pt,
;    Forecast incremental composite cost
  SPLITCOMP = 7,
;    Working matrices
  STARTMW = 40

Alternative script using cost differences

The XCHOICE command allows cost changes to be given instead of base and forecast costs. This variant is particularly useful when the costs do not change for all modes. For example, suppose that the car cost is constant and that a series of tests are to be conducted for different PT scenarios. In this case, there is no need to calculate the car cost for each test. Instead, the change in car cost, DC_car, can be set to zero. Where the cost difference is zero, the numeric value may be specified as the cost difference for the alternative (rather than needing to construct a matrix of zero values).

The example excludes the calculation of incremental composite costs.

; Incremental logit model (specifying cost differences)
; Specify scale parameter (or cost coefficient)
lambda = 0.02
XCHOICE,
;    List choices
  ALTERNATIVES = car, pt,
;    Input base demand by mode
  BASEDEMANDMW = 10, 11,
;    Input CHANGE in cost (= ForecastCost - BaseCost)
  DCOSTSMW = 24, 25,
;    Forecast demand
  ODEMANDMW = 2,3,
;    Model Structure
  SPLIT = TOTAL lambda car pt,
;    Working matrices
  STARTMW = 30

Matrix script for utility-based models

; Incremental logit model
XCHOICE,
;    List choices
  ALTERNATIVES = car, pt,
;    Input base demand by mode
  BASEDEMANDMW = 10, 11,
;    Input base utilities by mode
  BASEUTILSMW = 20, 21,
;    Input forecast utilities by mode
  UTILITIESMW = 30, 31,
;    Forecast demand
  ODEMANDMW = 2,3,
;    Model Structure
  SPLIT = TOTAL car pt,
;    Forecast incremental composite utilities
  SPLITCOMP = 7,
;    Working matrices
  STARTMW = 50

Alternative script using differences in utilities

The XCHOICE command allows changes in utility to be given instead of base and forecast utilities. This variant is particularly useful when the costs do not change for all modes. For example, suppose that the car cost is constant and that a series of tests is to be conducted for different PT scenarios. In this case, there is no need to calculate the car utility for each test. Instead, the change in car utility, DC_car, can be set to zero.

The example excludes the calculation of incremental composite utilities.

; Incremental logit model (specifying utility differences)
XCHOICE,
;   List choices
  ALTERNATIVES = car, pt,
;    Input base demand by mode
  BASEDEMANDMW = 10, 11,
;   Input CHANGE in utility (= ForecastUtility - BaseUtility)
;   Car utilities are unchanged, so are specified as '0'
  DUTILSMW = 0, 25,
;    Forecast demand
  ODEMANDMW = 2,3,
;    Model Structure
  SPLIT = TOTAL car pt,
;    Working matrices
  STARTMW = 100

Hierarchical logit model

This section describes the hierarchical logit model.

Introduction

The second example splits the PT mode of the first example into two distinct sub-modes: bus and train. This is an example of a Hierarchical Logit Model, which can be implemented in Absolute or Incremental form.

Hierarchic models group related choices together in nests (or hierarchies). In this example, bus and train are members of the public transport nest that is considered to be distinct from the (private) car mode.

In an absolute model, the choice probabilities are calculated by starting at the bottom of the tree and moving up the hierarchy, calculating the choice probabilities and the composite costs in each nest. In this model the process begins in the PT nest.

Firstly, conditional probabilities for each of the two PT modes P_bus|pt and P_train|pt and the composite PT cost are calculated within the lower nest with equations similar to those in the previous section. The composite PT cost will be used next to represent the cost associated with the combined PT choice.

The choice probabilities for car P_car and all PT P_pt can now be calculated using the technique described in the first example.

It is now possible to move back down the hierarchy forecasting demand for each mode with the information derived above, so that:

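D_{car} = D \cdot P_{car}    (21)

D_{bus} = D \cdot P_{pt} \cdot P_{bus|pt}    (22)

D_{train} = D \cdot P_{pt} \cdot P_{train|pt}    (23)

where D is the total demand.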

For incremental models, the calculations are again performed in two stages, using the equations of the Incremental Logit Choice Model. The first pass calculates conditional probabilities and composite costs working up the tree structure; then in the second pass working down the tree, the resulting probabilities are calculated.
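
The two passes can be sketched numerically for the absolute case. The Python below is a minimal illustration for one origin-destination cell, assuming the usual logsum form of the composite cost; the function names and the cost values are hypothetical, and the scale parameters are named lam (upper nest) and mu (PT nest) for this sketch.

```python
import math

def logsum_cost(costs, scale):
    """Composite cost of a nest: -(1/scale) * ln(sum(exp(-scale * cost)))."""
    return -math.log(sum(math.exp(-scale * c) for c in costs)) / scale

def nested_logit_demand(total, c_car, c_bus, c_train, lam, mu):
    # Pass 1 (up the tree): conditional probabilities in the PT nest
    w_bus = math.exp(-mu * c_bus)
    w_train = math.exp(-mu * c_train)
    p_bus_given_pt = w_bus / (w_bus + w_train)
    p_train_given_pt = 1.0 - p_bus_given_pt
    # Composite PT cost represents the combined PT choice in the upper nest
    c_pt = logsum_cost([c_bus, c_train], mu)
    w_car = math.exp(-lam * c_car)
    w_pt = math.exp(-lam * c_pt)
    p_car = w_car / (w_car + w_pt)
    p_pt = 1.0 - p_car
    # Pass 2 (down the tree): demand by mode
    return {
        "car": total * p_car,
        "bus": total * p_pt * p_bus_given_pt,
        "train": total * p_pt * p_train_given_pt,
    }

demand = nested_logit_demand(1000.0, c_car=30.0, c_bus=40.0, c_train=45.0,
                             lam=0.02, mu=0.03)
print(demand)
```

The composite PT cost is lower than the cost of either PT mode on its own, which reflects the benefit of having two options within the nest.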

Cost-based examples of hierarchic logit models

As before the total demand is specified together with generalized costs for each choice (car, bus and train).

A scale parameter must be associated with each nest. In this example, the scale parameters are λ=0.02 in the upper nest (consistent with the previous example) and μ=0.03 in the lower nest. It is a result of the model theory that the value of the parameters must increase (or at least not decrease) as one moves down the hierarchy. That is to say, the model’s sensitivity to cost increases down the hierarchy.

Matrix script

The example below shows how the earlier absolute-choice example can be extended to include the lower nest in the hierarchy:

; Absolute hierarchical logit model
; Specify scale parameters
lambda = 0.02
mu = 0.03
XCHOICE,
;    List choices
  ALTERNATIVES = car bus train,
;    Input demand
  DEMANDMW = 1,
;    Input costs
  COSTSMW = 4, 5, 6,
;    Forecast demand
  ODEMANDMW = 14,15,16,
;   Model Structure
;     Top level nest
  SPLIT = TOTAL lambda car pt,
;    Forecast composite cost top level
  SPLITCOMP = 19,
;     PT nest
  SPLIT = PT mu bus train,
;    Forecast composite cost PT level
  SPLITCOMP = 20,
;   Working matrices
  STARTMW =70

First the modes (car, bus and train) are declared with the ALTERNATIVES clause. Notice that PT is not declared as an alternative; this is the name given to the combined choice of bus and train.

Then the model inputs and outputs are specified. The input total demand and costs for the three distinct modes are given first, followed by the output forecast demand by mode.

The hierarchical structure is specified by moving down the tree describing each nest.

Beginning at the top of the tree, the first split command divides TOTAL (the total demand) into car and (all) PT with scale parameter λ=0.02. Notice that a scalar variable called lambda has been used to represent the scale parameter in this case. PT is not considered as an individual mode now, but as a link with the lower nest. The SPLITCOMP keyword computes the composite costs for the nest associated with this SPLIT, in this case the top level.

The second split command sub-divides PT trips between the bus and train alternatives using a scale parameter μ=0.03. The name pt is used as the first value in the SPLIT clause, and this acts as a link to the next level up the tree (where total demand was divided between car and PT). Another SPLITCOMP keyword for this SPLIT keyword computes the composite costs specific to the PT nest.

More complex absolute hierarchical models

The hierarchy can be extended with additional nests on either side of the tree. For example, a large absolute hierarchical logit model structure might have six choices: car, park and ride, bus, heavy rail, light rapid transit (LRT), and metro.

Matrix script illustrating the complex example

This example may be coded as an absolute model as below, with park and ride abbreviated as pandr, and heavy rail as hrail:

; Extract from a large absolute logit model
XCHOICE,
;    List choices
  ALTERNATIVES = car pandr bus hrail lrt metro,
;    Input demand
  DEMANDMW = 1,
;    Input costs
  COSTSMW = 11, 12, 13, 14, 15, 16,
;    Forecast demand
  ODEMANDMW = 21,22,23,24,25,26,
;    Model Structure
;      Top level nest
  SPLIT = TOTAL 0.02 allcar allpt,
;   All car nest
  SPLIT = allcar 0.05 car pandr,
;   All PT nest
  SPLIT = allpt 0.03 bus train,
;   Train nest
  SPLIT = train 0.04 hrail lrt metro,
;   Working matrices
  STARTMW = 45

Utility-based examples of hierarchic logit models

Scale parameter in utility-based models

The hierarchic choice model comprises a main choice nest and a number of sub-nests. As with cost-based models, there is a requirement that higher level nests are less sensitive to cost differences than any sub-nests which lie below them.

In the estimation of the utility equations used in the choice model, coefficients are estimated for individual cost-terms (such as travel time and fare) and for scale parameters. The scale parameter is a factor which is applied to the composite utility of a sub-nest before it is used in the choice process of the parent choice nest. To meet the sensitivity requirements noted above, the scale parameter must be greater than 0, and must not exceed 1.0.

Thus, the scale parameters of utility-based models are viewed as part of the specification of the output of a split process, and different values may apply to each sub-nest in a choice nest. Where the scale parameter for a sub-nest is 1.0, there is no need to specify its value as this is the assumed default. Where a scale parameter applies to two or more sub-nests, it may be specified once and brackets used to group the relevant sub-nests, so avoiding repetition of the parameter.

Matrix script illustrating two-level hierarchy

Using the tree structure shown in Structure of hierarchical logit model as the basis of an absolute choice model, the total demand is specified together with the utility of each choice (car, bus and train).

A scale parameter of 0.6 is applied to the PT sub-nest when its composite (or logsum) utility is used in the calculations of the higher-level nest.

; Absolute hierarchical logit model
XCHOICE,
;    List choices
  ALTERNATIVES = car bus train,
;    Input demand
  DEMANDMW = 1,
;    Input utilities
  UTILITIESMW = 4, 5, 6,
;    Forecast demand
  ODEMANDMW = 14,15,16,
;   Model Structure
;   Top level nest. PT sub-nest has scale parameter 0.6
  SPLIT = TOTAL 1.0 car 0.6 pt,
;   Forecast composite cost for the top level
  SPLITCOMP = 19,
;   PT nest
  SPLIT = PT bus train,
;   Forecast composite cost for the PT nest
  SPLITCOMP = 20,
;   Working matrices
  STARTMW =70

First the modes (car, bus and train) are declared with the ALTERNATIVES clause. Notice that PT is not declared as an alternative; this is the name given to the combined choice of bus and train.

Then the model inputs and outputs are specified. The input total demand and utilities for the three distinct modes are given first, followed by the output forecast demand by mode.

The hierarchical structure is specified by moving down the tree describing each nest.

Beginning at the top of the tree, the first split command divides TOTAL (the total demand) into car and (all) PT, and specifies a scale parameter of 0.6 which applies to the PT sub-nest. PT is not considered as an individual mode now, but as a link with the lower nest. The composite costs for this SPLIT are output with SPLITCOMP.

The second split command sub-divides PT trips between the bus and train alternatives, but no scaling parameters are used. The name pt is used as the first value in the SPLIT clause, and this acts as a link to the next level up the tree (where total demand was divided between car and PT). The composite cost for this SPLIT is output with another SPLITCOMP keyword.
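
The role of the sub-nest scale parameter can be illustrated with a small Python sketch for one origin-destination cell. It assumes the common formulation in which the parent nest uses the sub-nest's logsum utility multiplied by the scale parameter; the function name, the parameter name theta_pt and the utility values are hypothetical and are not taken from the Matrix program.

```python
import math

def nested_logit_utilities(total, u_car, u_bus, u_train, theta_pt):
    """Two-level absolute model driven by utilities rather than costs."""
    # PT nest: conditional shares and composite (logsum) utility
    e_bus, e_train = math.exp(u_bus), math.exp(u_train)
    p_bus_given_pt = e_bus / (e_bus + e_train)
    logsum_pt = math.log(e_bus + e_train)
    # Top nest: the PT logsum is scaled by theta_pt before it is used
    e_car = math.exp(u_car)
    e_pt = math.exp(theta_pt * logsum_pt)
    p_pt = e_pt / (e_car + e_pt)
    return {
        "car": total * (1.0 - p_pt),
        "bus": total * p_pt * p_bus_given_pt,
        "train": total * p_pt * (1.0 - p_bus_given_pt),
    }

demand = nested_logit_utilities(1000.0, u_car=-0.6, u_bus=-1.2,
                                u_train=-1.35, theta_pt=0.6)
print(demand)
```

With theta_pt set to 1.0 the structure collapses to a single-level logit across car, bus and train, which is why 1.0 is a natural default.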

Matrix script illustrating the complex example

A more complex utility-based example uses the model structure illustrated in Structure of a large absolute hierarchical logit model. The example below is coded in incremental form using utility differences:

; Extract from a large incremental logit model
XCHOICE,
;    List choices
  ALTERNATIVES = car pandr bus hrail lrt metro,
;    Base demand
  BASEDEMANDMW = 1, 2, 3, 4, 5, 6,
;    Utility differences
  DUTILSMW = 11, 12, 13, 14, 15, 16,
;    Forecast demand
  ODEMANDMW = 21,22,23,24,25,26,
;   Model Structure
;     Top level nest
  SPLIT = TOTAL 0.4 allcar 0.667 allpt,
;      All car nest
  SPLIT = allcar car pandr,
;      All PT nest
  SPLIT = allpt 1.0 bus 0.75 train,
;      Train nest
  SPLIT = train hrail lrt metro,
;     Working matrices
  STARTMW = 45

In this example a scale parameter of 0.4 is applied to the all-car sub-nest; also 0.667 to the all-PT sub-nest and 0.75 to the train sub-nest.

Destination choice

This section describes the destination-choice model.

Introduction

This section shows how a logit model can be used to forecast destination choice.

Suppose that an aggregate demand model has 100 zones. Associated with each origin zone (denoted i) there is a total demand of D_i that is to be distributed between the 100 possible destinations (denoted j) according to the cost of the trip C_ij. The figure Structure of destination choice model illustrates the structure, where d1, d2,… denote the choice of destination 1, destination 2 and so on.

For the absolute choice model, the probability of choosing destination zone j from origin zone i is given by P_ij:

P_{ij} = \frac{\exp(-\lambda C_{ij})}{\sum_{k} \exp(-\lambda C_{ik})}    (24)

Where λ is the scale parameter. This equation is no more than a generalized case of the equation in the absolute logit model that forecasts the demand for car and PT. In this case, the choices are destination zones.
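
As a numerical illustration of the absolute destination-choice calculation, the Python sketch below distributes the demand from one origin across its destinations. The function name and the sample costs are hypothetical; the sketch assumes the standard logit form with a negative exponential of scaled cost.

```python
import math

def destination_choice(trips_from_i, cost_row, lam):
    """Distribute the trips from one origin across all destinations."""
    weights = [math.exp(-lam * c) for c in cost_row]
    denom = sum(weights)
    return [trips_from_i * w / denom for w in weights]

# One origin, three destinations (illustrative costs)
row = destination_choice(100.0, [10.0, 20.0, 40.0], lam=0.01)
print(row)
```

Repeating the call for every origin zone produces the full forecast demand matrix, one row per origin.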

The incremental logit model is also supported. Its underlying theory is similar to the equations in the incremental logit model, but with alternative modes replaced by destination zones. In this model, the base situation matrices of demand and cost (or cost differences) are also input rather than total travel demand. The incremental model is more widely used in studies than the absolute formulation.

For absolute utility-based models, the destination choice model takes the form:

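P_{ij} = \frac{\exp(U_{ij})}{\sum_{k} \exp(U_{ik})}    (25)

where U_ij is the utility of travelling from origin i to destination j.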

The incremental form of the logit choice model is also supported, and uses the incremental choice equations with alternative modes replaced by different destination zones. The clauses defining data input reflect the data required by the model, for example UTILITIESMW (utilities) and DUTILSMW (differences in utilities).

Matrix script

The destination choice model described above is coded in the script below.

; Simple destination choice model
; Specify choice parameter
lambda = 0.01
XCHOICE,
;   Alternatives (only one, as doing destination choice)
  ALTERNATIVES = all,
;   Input demand from each origin
  DEMAND = TripsFromI[i],
;   Input cost matrix
  COSTSMW = 3,
;   Forecast demand from each origin to each destination
  ODEMANDMW = 7,
;   Model Structure
  DESTSPLIT = TOTAL lambda all,
;   Working matrices
  STARTMW = 20

The extract begins with the specification of alternatives, which comprises just one alternative as there is no choice between modes. This is followed by the model inputs: in this case the demand is an array of trips from each zone and the generalized cost is matrix MW[3]. The output is a forecast demand matrix, MW[7].

The structure of this model is defined by the DESTSPLIT clause which defines a destination choice process. A scale parameter of λ=0.01 has been chosen for this example. The output from the split clause is the alternative "all," as listed on the ALTERNATIVES keyword.

The main differences for utility-based scripts are use of utility keywords (UTILITIESMW being used in place of COSTSMW), and no use of lambda, the scale parameter, so that the DESTSPLIT clause now becomes DESTSPLIT = TOTAL all.

Mode and destination choice

This section discusses the mode-and-destination-choice model.

Introduction

The XCHOICE command supports mode followed by destination choice, which is considered in this section.

The values of scale parameter for the various choice nests, which reflect sensitivity to cost differences, influence the effectiveness of the model.

Mode followed by destination choice

Consider this example of mode followed by destination choice. The figure illustrates a system with two modes, car and PT, and 100 destination zones (labelled d1, d2,…, d100).

Mode followed by destination choice

Here the total demand is split first by mode (into two vectors representing car and PT demand for each origin) then by destination (giving car and PT matrices).

Matrix Script

The script extract below is similar to previous examples shown in Incremental logit model and Destination choice.

The model structure specification splits the total demand by mode first with scale parameter λ=0.01, then across destinations for each mode individually. The parameters for destination choice are μ=0.02 and θ=0.03 for car and PT respectively.

; Mode choice above destination choice model
; Specify choice parameters
lambda = 0.01
mu = 0.02
theta = 0.03
XCHOICE,
;    List choices
  ALTERNATIVES = car pt,
;    Base Demand
  BASEDEMANDMW = 1,2,
;    Base Costs
  BASECOSTSMW = 11,12,
;    Forecast costs
  COSTSMW = 21,22,
;    Forecast demand matrices by mode
  ODEMANDMW = 31,32,
;    Model Structure
;      Mode choice
  SPLIT = TOTAL lambda destcar destpt,
;      Car destination choice
  DESTSPLIT = destcar mu car,
;      PT destination choice
  DESTSPLIT = destpt theta pt,
;   Working matrices
  STARTMW=110
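
The order of calculation in a mode-above-destination structure can be sketched in Python for a single origin. For brevity this sketch is an absolute model rather than the incremental form used in the script; the function names and the cost values are hypothetical, and the composite cost of each destination nest is assumed to take the usual logsum form.

```python
import math

def dest_nest(costs, scale):
    """Destination shares and composite cost for one mode from one origin."""
    weights = [math.exp(-scale * c) for c in costs]
    denom = sum(weights)
    shares = [w / denom for w in weights]
    composite = -math.log(denom) / scale
    return shares, composite

def mode_then_destination(total, car_costs, pt_costs, lam, mu, theta):
    # Lower nests: destination choice for each mode
    car_shares, c_car = dest_nest(car_costs, mu)
    pt_shares, c_pt = dest_nest(pt_costs, theta)
    # Upper nest: mode choice on the composite costs
    w_car = math.exp(-lam * c_car)
    w_pt = math.exp(-lam * c_pt)
    p_car = w_car / (w_car + w_pt)
    # Back down the tree: one row of each mode matrix
    car_row = [total * p_car * s for s in car_shares]
    pt_row = [total * (1.0 - p_car) * s for s in pt_shares]
    return car_row, pt_row

car_row, pt_row = mode_then_destination(
    1000.0, car_costs=[20.0, 30.0, 50.0], pt_costs=[25.0, 28.0, 60.0],
    lam=0.01, mu=0.02, theta=0.03)
print(car_row, pt_row)
```

The total from the origin is conserved across the two output rows, and each row is one origin's contribution to the car and PT matrices.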

General notes

This section provides general notes about logit choice models.

Availability of choices

Within a demand model, it is often useful to make certain choices unavailable. For example, a rail mode might not be a practical alternative for travel between all zones in the study area. To make a choice unavailable you should give that choice a large positive generalized cost (or large negative utility). Large in this context is 100 times greater than typical costs. For example, if costs are up to 400 generalized minutes, try using a cost of 40000 minutes to make a particular choice unavailable.
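
The effect of a very large cost can be checked with a short Python sketch. The function name and values are illustrative; subtracting the minimum cost before exponentiating is a choice made for this sketch to keep the arithmetic stable, and is not a statement about how the Matrix program evaluates the model.

```python
import math

def mode_shares(costs, lam):
    """Logit shares, shifted by the minimum cost for numerical stability."""
    # Shares are unchanged by the shift, but the largest weight is exactly 1,
    # so the denominator cannot underflow to zero
    c_min = min(costs)
    weights = [math.exp(-lam * (c - c_min)) for c in costs]
    denom = sum(weights)
    return [w / denom for w in weights]

# car, rail with typical costs, then with rail made unavailable
typical = mode_shares([300.0, 400.0], lam=0.02)
rail_unavailable = mode_shares([300.0, 40000.0], lam=0.02)
print(typical, rail_unavailable)
```

With the rail cost set to 40000, its share is indistinguishable from zero and all demand is assigned to car.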

Applying choice models to selected parts of matrices

There are instances when it is desirable to apply choice models to selected parts of a matrix, rather than the entire matrix. These typically arise when matrices can be broken down into distinct segments (such as trips entirely within study area, through trips etc.), and different choice models apply to these different segments. In some instances the model structure is common to all segments, but the sensitivity (and so cost-coefficient) varies between segments. Under these conditions, the cost coefficient may be specified as a matrix (such as MW[5], or mi.1.coefficient) rather than a scalar value. Where the model structure varies between segments, or the user wishes to apply the choice model separately for each segment, the following provides guidance on techniques used.

When an XCHOICE control statement occurs in the Matrix program it will typically be applied to all cells (or origin-destination pairs). The XCHOICE control statement may be coded inside a conditional test which selects the origin zones it is applied to:

;  test to see if model applies to this origin zone
if (i < 45)
;  if so, then apply the model
   XCHOICE,
;  subsequent clauses of XCHOICE statement, as needed
;  ...
endif

For simple choices between alternative modes, this may be extended, by use of the JLOOP command, to select particular rows and columns of the matrix where the choice model is applied:

;apply the model to trips from zones 1-45 to zones 46-53
if (i <= 45)
    JLOOP INCLUDE = 46-53
        XCHOICE,
;       subsequent clauses of XCHOICE statement, as needed
;       ...
    ENDJLOOP
endif

Where the choice model applies to a segment which is not a regular set of rows and columns, but is determined by some other matrix attribute (such as distance from origin to destination), a script may be coded as follows:

;start a JLOOP to process each origin-destination pair in turn
JLOOP
;  test to see if O-D meets selection criteria for the segment
   if (mi.1.distance > 50)
;    if so, then apply the model
     XCHOICE,
;    subsequent clauses of XCHOICE statement, as needed
;    ...
   endif
ENDJLOOP
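
The cell-by-cell selection can be mimicked in Python for one row of a matrix. This is a sketch of the selection logic only: the function name and values are hypothetical, and leaving unselected cells at zero is an assumption of the sketch rather than documented behaviour of the Matrix program.

```python
import math

def split_selected_cells(demand_row, car_costs, pt_costs, distances,
                         lam, min_distance):
    """Apply a car/PT split only to cells whose distance exceeds a threshold."""
    car_row, pt_row = [], []
    for demand, c_car, c_pt, dist in zip(demand_row, car_costs,
                                         pt_costs, distances):
        if dist > min_distance:
            w_car = math.exp(-lam * c_car)
            w_pt = math.exp(-lam * c_pt)
            p_car = w_car / (w_car + w_pt)
            car_row.append(demand * p_car)
            pt_row.append(demand * (1.0 - p_car))
        else:
            # Assumption for this sketch: unselected cells are left at zero
            car_row.append(0.0)
            pt_row.append(0.0)
    return car_row, pt_row

car_row, pt_row = split_selected_cells(
    demand_row=[100.0, 200.0, 50.0],
    car_costs=[20.0, 60.0, 90.0],
    pt_costs=[25.0, 55.0, 120.0],
    distances=[10.0, 60.0, 80.0],
    lam=0.02,
    min_distance=50.0,
)
print(car_row, pt_row)
```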

Where a model seeks to use destination choice, this cannot be coded inside a JLOOP construct. If only selected destinations are valid for the destination choice, then the INCLUDE or EXCLUDE sub-keyword should be specified immediately after the DESTSPLIT keyword in order to include or exclude the required zones.

Practical considerations: Incremental models

In an incremental model, if the base demand for a choice is zero, the forecast demand for that choice will always be zero.

This is made clear if one examines the equation in the Incremental logit model, where, if P_car=0 then P’_car=0 whatever costs are given. If this effect is a problem, then the modelling approach should be reviewed.
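
A small Python check makes the same point numerically. It assumes the standard incremental logit form, in which each base share multiplies its exponential term; the function name and values are illustrative.

```python
import math

def incremental_shares(base_shares, cost_changes, lam):
    """Forecast shares from base shares and cost changes."""
    weights = [p * math.exp(-lam * dc)
               for p, dc in zip(base_shares, cost_changes)]
    denom = sum(weights)
    return [w / denom for w in weights]

# PT has no base demand; even a very large PT cost saving cannot create any
shares = incremental_shares([1.0, 0.0], [0.0, -1000.0], lam=0.02)
print(shares)
```

Because the zero base share multiplies the exponential term, the forecast PT share stays at zero however favourable the cost change is.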

Practical considerations: Scale parameters

For cost-based models calibrated in terms of generalized costs to give sensible results, the scale parameters must increase (or at least not decrease) as one moves down the hierarchy. That is to say, the model’s sensitivity to cost increases down the hierarchy. Where these conditions are not met, an error message will be output.
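
The condition can be checked before a run with a few lines of Python. The sketch below is a hypothetical helper, not part of the Matrix program: it takes a dictionary describing each nest's scale parameter and members, and reports any sub-nest whose parameter is smaller than its parent's. The values mirror the large cost-based example earlier in this section.

```python
def check_scale_parameters(nests, root="TOTAL"):
    """Return (parent, child) pairs where the child's scale is too small.

    nests maps a nest name to (scale parameter, list of members); members
    that are not themselves keys of nests are treated as plain alternatives.
    """
    problems = []
    stack = [root]
    while stack:
        name = stack.pop()
        scale, members = nests[name]
        for member in members:
            if member in nests:
                if nests[member][0] < scale:
                    problems.append((name, member))
                stack.append(member)
    return problems

nests = {
    "TOTAL": (0.02, ["allcar", "allpt"]),
    "allcar": (0.05, ["car", "pandr"]),
    "allpt": (0.03, ["bus", "train"]),
    "train": (0.04, ["hrail", "lrt", "metro"]),
}
print(check_scale_parameters(nests))
```

An empty list means every sub-nest is at least as sensitive to cost as its parent.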